Issues: huggingface/trl

Issues list

LORA Continuos pre-training on 7B Instruct Model ✨ enhancement New feature or request ⚡ PEFT Related to PEFT 🏋 SFT Related to SFT
#3509 opened May 29, 2025 by sinchanabhat
High KL Divergence in GRPO with GPT2-Style Model (Due to Dropout?) 🏋 GKD Related to GKD 🏋 GRPO Related to GRPO 🏋 SFT Related to SFT
#3500 opened May 27, 2025 by cliang-huanglab
Converting a conversational dataset into a standard dataset [not working] 🐛 bug Something isn't working
#3490 opened May 23, 2025 by nbasyl
5 tasks done
Vision Fine Tuning Gemma 3 takes Impossiblily High VRam (OOM Error 8xH200) ⚡accelerate Related to accelerate 🐛 bug Something isn't working ⚡ PEFT Related to PEFT
#3481 opened May 22, 2025 by amanmehra89
5 tasks done
[GPG][new trainer] Add support to new GPG method ✨ enhancement New feature or request
#3472 opened May 20, 2025 by lerogo
3 tasks done
[GRPO] bnb quantization + vllm 🐛 bug Something isn't working 🏋 GRPO Related to GRPO ⚡ PEFT Related to PEFT
#3466 opened May 18, 2025 by shon-otmazgin-wix
5 tasks done
Turn off Accelerate acceleration ⚡accelerate Related to accelerate 🏋 GRPO Related to GRPO
#3461 opened May 17, 2025 by seTalent
Out of Memory when GRPO fine-tune Qwen3 4B model on 80G A100 GPU 🐛 bug Something isn't working 🏋 GRPO Related to GRPO
#3456 opened May 16, 2025 by wa008
5 tasks done
Reward_model error in cli GRPO train when I only use reward function rather than reward_model 🐛 bug Something isn't working 📱 cli Related to the Command-line interface 🏋 GRPO Related to GRPO 🏋 Reward Related to Reward modelling
#3455 opened May 16, 2025 by wa008
5 tasks done
PPO training fails when used with accelerate ⚡️ and Deepspeed 🚀 ⚡accelerate Related to accelerate 🚀 deepspeed Related to deepspeed 🏋 PPO Related to PPO 🏋 SFT Related to SFT
#3453 opened May 16, 2025 by marcellobullo
5 tasks done
GRPO reward=0 and loss=0 🏋 GRPO Related to GRPO 🏋 Reward Related to Reward modelling
#3452 opened May 15, 2025 by LIUyizheSDU
torch distributed training with multi gpus errors in GRPOtrainer 🐛 bug Something isn't working 🏋 GRPO Related to GRPO
#3451 opened May 15, 2025 by jinhonglu
5 tasks done
trl vllm-serve not working on latest. 🐛 bug Something isn't working 🏋 GRPO Related to GRPO
#3450 opened May 15, 2025 by tcapelle
5 tasks done
[GRPO] num_generations 🏋 GRPO Related to GRPO ❓ question Seeking clarification or more information
#3443 opened May 13, 2025 by shon-otmazgin-wix
5 tasks done
Unstructured data grpo training 🐛 bug Something isn't working 🏋 GRPO Related to GRPO
#3441 opened May 13, 2025 by yuyuhua918